ABSTRACT

We create a variety of new quantum algorithms that use Grover's algorithm and similar
techniques to give polynomial speedups over their classical counterparts. We begin by
introducing a set of tools that carefully minimize the impact of errors on running time;
those tools provide us with speedups to already-published quantum algorithms, such as
improving Dürr, Heiligman, Høyer and Mhalla's algorithm for single-source shortest
paths by a factor of $\lg N$. The algorithms we construct from scratch have a range of
speedups, from $O(E) \to O(\sqrt{VE \lg V})$ speedups in graph theory to an $O(N^3) \to O(N^2)$
speedup in dynamic programming.



1 Introduction 



This paper introduces several new quantum algorithms which are polynomially faster than
their classical counterparts. We introduce these through the use of Grover's algorithm and
its descendants as introduced by Boyer, Brassard, Høyer and Tapp (modified in an appendix)
and Buhrman, Cleve, de Wolf and Zalka. We begin by introducing some basic tools,
such as minimum-finding, that use Grover's algorithm directly; in the construction of those
tools we pay particular attention to the probability with which they fail, and make their
running time depend as little as possible on the desired probability of failure.

After introducing our tools we cast our gaze over several fields, striving to address a
variety of classical algorithms, especially those that are illustrative of a particular problem
type. We find $O(\sqrt{E/V})$ improvements in some important graph theory algorithms, and
also examine some already-published quantum algorithms in graph theory, giving them
logarithmic speedups by improving how they deal with errors. After that we examine some
algorithms in computational geometry and dynamic programming, where we find perhaps
our most impressive individual results: $O(N)$ and $O(\sqrt{N})$ improvements over the
best-known classical algorithms. For completeness' sake, we include an appendix of comments
and caveats (Appendix B), which contains a section on some of the notation used here with
which physicists might be unfamiliar.

For a summary of our algorithms' running times compared to those for classical solutions
to the same problems, please see our conclusions section.

2 Grover's algorithm 

We make extensive use of descendants of Grover's search algorithm. Grover's algorithm
works as follows: we are given a binary function (one that returns only 0 or 1), F, over a
domain of size N, with only one value for x such that F(x) = 1 (we will call such values
"solutions for F"). Grover found that it took just $O(\sqrt{N})$ calls to F to find a value x
such that F(x) = 1. To find such a value of x classically, assuming no knowledge of the
properties of F, would take $O(N)$ calls to F. Since its initial introduction by Grover,
several improvements have been made to the algorithm; here we restate the results we will
use, which we will refer to by the initials of their authors:

• BBHT: If there are $M > 0$ solutions to F in the domain (we do not need to know M),
the BBHT[2] search algorithm returns one random such element after $O(\sqrt{N/M})$ calls
to F. There is probability $\approx .5 M^{-.93}$ that it will fail, returning the special value false
after $O(\sqrt{N})$ calls to F. If M = 0, it returns false in $O(\sqrt{N})$ calls to F. Note that in
their original paper, Boyer, Brassard, Høyer and Tapp do not discuss the probability
of failure and the M = 0 case in depth; we do so in an appendix.

• BCWZ: The BCWZ search algorithm is passed a parameter $\epsilon^{-1}$ and returns a
random solution to F after $O(\sqrt{N \lg \epsilon^{-1}})$ calls, provided that such a solution exists.
There is a probability $\epsilon$ that it will fail, in which case it returns false. If M = 0, it
returns false in $O(\sqrt{N \lg \epsilon^{-1}})$ calls to F.
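To make the $O(\sqrt{N/M})$ scaling above concrete, the following Python sketch evaluates the standard closed-form success probability of Grover's iteration, $\sin^2((2j+1)\theta)$ with $\sin\theta = \sqrt{M/N}$. It is a classical calculation only, not an implementation of BBHT or BCWZ, and the function names are our own illustrative choices.

```python
import math


def grover_success_probability(n: int, m: int, iterations: int) -> float:
    """Chance that a measurement after `iterations` Grover steps yields a solution.

    n is the domain size and m the number of solutions (0 < m <= n).
    """
    theta = math.asin(math.sqrt(m / n))
    return math.sin((2 * iterations + 1) * theta) ** 2


def best_iteration_count(n: int, m: int) -> int:
    """Iteration count near the first maximum of the success probability."""
    theta = math.asin(math.sqrt(m / n))
    return int(math.floor(math.pi / (4 * theta)))


if __name__ == "__main__":
    for n, m in [(4, 1), (1_000_000, 1), (1_000_000, 100)]:
        j = best_iteration_count(n, m)
        print(n, m, j, grover_success_probability(n, m, j))
```

The iteration count grows like $\sqrt{N/M}$, which is why BBHT, which does not know M, has to try a growing range of iteration counts rather than a single fixed one.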

3 Algorithmic tools 

Here we present some basic algorithms, founded on the above primitives, that serve as
subroutines to be used throughout this paper (where they will be referred to by their
abbreviated names, found in the subsection headers). We begin by noting that if an algorithm
is to be run R times, and we want it to succeed all R times with some constant probability,
the algorithm must have probability $\epsilon < 1/R$ of failure. Because of this, we will sometimes
talk about $\epsilon^{-1}$ being polynomial, and we carefully formulate algorithms in this section to
minimize the dependence of running time on $\epsilon$.

Please note that each of the following functions operates with some given function F, 
whose evaluation could have some arbitrary time complexity; as such, our unit of time for 
this section is "calls to F." Where there are terms in the complexity of a tool that do not 
depend on F's running time, the function t(F), denoting F's running time, will appear in 
the analysis of the tool. 

3.1 Checking for a solution to F, findsol 

Theorem 1 Take a function F over a domain of size N. The following algorithm findsol
determines whether there is a solution x in the domain such that F(x) = 1, in
$O(\sqrt{N/M} + \sqrt{N \lg \epsilon^{-1}}\, M^{-1.86})$ calls to F on average when there are M solutions,
and in $O(\sqrt{N \lg \epsilon^{-1}})$ calls to F on average when there are none. If there are solutions,
findsol returns a random one with probability $> 1 - .5 M^{-1.86} \epsilon$; if there is no solution or
if it fails, it returns the special value false after $O(\sqrt{N \lg \epsilon^{-1}})$ calls to F.

In the following we use an extra parameter r, which we could never quite find a use for
in the remainder of our paper. We include it as a parameter here in case someone else is
subject to greater inspiration.

The principle we use here is very straightforward. First, we acknowledge that we can't
do any better than $\sqrt{N \lg \epsilon^{-1}}$ (a single BCWZ) in the case where there are no solutions, so
we try to optimize for the case where there are solutions and we can hope for $O(\sqrt{N/M})$
calls to F. To do this, we try BBHT first, due to its faster running time. Then if we have
not found a solution, we check for one with BCWZ to make sure.

1. Run BBHT up to r times. If any of those returns a result that satisfies F, immediately
return that result.

2. Run BCWZ with parameter $\epsilon^{-1}$. If it returns a result that satisfies F, return that
result; otherwise return false.
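The control flow of these two steps can be sketched classically. In the Python sketch below, `classical_search` is a hypothetical stand-in for both quantum searches (it simply samples a solution and can be told to fail with a given probability), and `None` plays the role of the special value false; neither is part of the paper's construction.

```python
import random


def classical_search(f, domain, rng, fail_prob=0.0):
    """Hypothetical stand-in for a quantum search: a random solution, or None."""
    solutions = [x for x in domain if f(x)]
    if not solutions or rng.random() < fail_prob:
        return None
    return rng.choice(solutions)


def findsol(f, domain, rng, r=2):
    """Try up to r cheap searches, then one high-confidence search."""
    # Step 1: up to r fast attempts (BBHT's role); each may fail.
    for _ in range(r):
        x = classical_search(f, domain, rng, fail_prob=0.5)
        if x is not None and f(x):
            return x
    # Step 2: a single careful attempt (BCWZ's role).
    x = classical_search(f, domain, rng, fail_prob=0.0)
    return x if x is not None and f(x) else None


if __name__ == "__main__":
    rng = random.Random(0)
    print(findsol(lambda x: x == 7, range(16), rng))
    print(findsol(lambda x: False, range(16), rng))
```

The sketch only mirrors the ordering of the two steps; the running-time bounds in Theorem 1 come from the quantum subroutines and are not reproduced by the classical stand-in.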

The analysis for this is very straightforward. If there are solutions, step 1 takes an
average of $O(2\sqrt{N/M})$ calls to F (it repeats less than twice on average). That fails with
probability $O(.5^r M^{-.93r})$; if it does we move on to step 2, which takes $\sqrt{N \lg \epsilon^{-1}}$ calls to
F. This gives us a total of $O(2\sqrt{N/M} + .5^r M^{-.93r} \sqrt{N \lg \epsilon^{-1}})$ average calls to F in the case
where there are solutions; these reduce to the promised quantities when r = 2. If there
are no solutions, step 1 is $O(r\sqrt{N})$ and step 2 is $O(\sqrt{N \lg \epsilon^{-1}})$.

Looking at the probability of failure, we observe that the algorithm cannot possibly find
a solution that does not exist, and therefore cannot fail when there are no solutions. If there
are solutions, the probability of failure is $< .5^r M^{-.93r} \epsilon$, the probability that the BBHTs
and BCWZ all fail.

We chose r = 2 because 2 is the smallest value that gives us a probability of error
proportional to less than $M^{-1}$, and thus it typically minimizes running time given that
condition. Almost any constant is a reasonable choice for r.

3.2 Minimum finding, minfind 

Theorem 2 Take a function F over a domain of size N. The following algorithm minfind
finds x in the domain such that F(x) is minimized, in expected time $O(\sqrt{N \lg \epsilon^{-1}})$ and
with probability $\epsilon$ of failure.

This algorithm is based on one by Dürr and Høyer[5]. The motivation for this algorithm,
as with theirs, is repeatedly to find y with smaller and smaller values for F(y). To do this
efficiently, we use findsol as introduced in section 3.1.

1. Pick y uniformly at random from the domain of F.

2. Repeat the following until instructed to return:

(a) Run findsol with parameter $\epsilon^{-1}$ to find an element $y'$ such that $F(y') < F(y)$.

(b) If findsol returns an element, set $y = y'$; otherwise return y.
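The loop above can be illustrated with a classical sketch in Python. Here the call to findsol is replaced by a hypothetical stand-in that picks a uniformly random strictly better element, which is enough to show the descent structure but says nothing about quantum query counts.

```python
import random


def minfind(f, domain, rng):
    """Return (argmin, number_of_improvement_steps) by repeated random descent."""
    items = list(domain)
    y = rng.choice(items)                             # step 1
    steps = 0
    while True:                                       # step 2
        better = [x for x in items if f(x) < f(y)]    # stand-in for findsol
        if not better:
            return y, steps                           # no better element remains
        y = rng.choice(better)
        steps += 1


if __name__ == "__main__":
    values = [9, 4, 7, 1, 8, 3]
    best, steps = minfind(lambda i: values[i], range(len(values)), random.Random(1))
    print(best, values[best], steps)
```

Because each step jumps to a uniformly random better element, the number of improvement steps is typically logarithmic in the domain size, which is the behaviour the running-time sum in the analysis relies on.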

Dürr and Høyer show that the probability of reaching the $k$th lowest value is $1/k$, and
that for different k, those probabilities are independent. With that in mind, we can sum
over all values of k to arrive at an average running time and a probability of failure. For
running time, we find:

$$t_{\text{minfind}} = \sqrt{N \lg \epsilon^{-1}} + \sum_{k=2}^{N} \frac{1}{k}\left(\sqrt{\frac{N}{k-1}} + \sqrt{N \lg \epsilon^{-1}}\,(k-1)^{-1.86}\right) = O\left(\sqrt{N \lg \epsilon^{-1}}\right)$$

calls to F. We calculate the probability of failure similarly, first noting that $P_{\text{fail}} \le \sum_k P_{\text{fail}}(k)$.

3.3 Finding all x that satisfy F, findall 

Theorem 3 Take a binary function F over a domain of size N, in which there are M
different parameters (solutions) that satisfy F. The following algorithm findall finds all x
for which F(x) = 1, in $O(\sqrt{NM} + \sqrt{N \lg \epsilon^{-1}})$ calls to F on average, with probability $\epsilon$ of
failure.

The idea behind this algorithm is to find successive solutions x, striking each off the 
search as we find it in order to guarantee that we find something different every time. We 
do this straightforwardly with findsol. 

1. Create a hash table H to store results found so far.

2. Repeat the following until instructed to return:

(a) Run findsol with parameter $\epsilon^{-1}$ to find an element that satisfies F but is not in
H (has not been found yet).

(b) If findsol returns an element, add it to the result set and H; otherwise, return
the result set.
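As a classical illustration of the steps above, the Python sketch below strikes each found solution off the search by recording it in a set (playing the role of H). The inner search is a hypothetical stand-in for findsol that samples uniformly among the solutions not yet found.

```python
import random


def findall(f, domain, rng):
    """Collect every x with f(x) true, one new solution per search call."""
    items = list(domain)
    found = set()      # role of the hash table H
    results = []
    while True:
        # Stand-in for findsol on the predicate "satisfies F and is not in H".
        candidates = [x for x in items if f(x) and x not in found]
        if not candidates:
            return results
        x = rng.choice(candidates)
        found.add(x)
        results.append(x)


if __name__ == "__main__":
    print(sorted(findall(lambda x: x % 3 == 0, range(10), random.Random(1))))
```

The order in which solutions are returned is random, so callers that need a canonical order should sort the result.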



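We calculate the running time with a straightforward integral.

$$t_{\text{findall}} = \sqrt{N \lg \epsilon^{-1}} + \sum_{k=1}^{M}\left(\sqrt{N/k} + k^{-1.86}\sqrt{N \lg \epsilon^{-1}}\right)$$

$$\approx \sqrt{N \lg \epsilon^{-1}} + \int_{1}^{M} dk \left(\sqrt{N/k} + k^{-1.86}\sqrt{N \lg \epsilon^{-1}}\right) = O\left(\sqrt{NM} + \sqrt{N \lg \epsilon^{-1}}\right)$$

calls to F. We calculate the probability of failure similarly, noting that $P_{\text{fail}} \le \sum_k P_{\text{fail}}(k)$.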



Hash tables, while a useful construct, are a somewhat thorny topic in algorithms: specifically,
for any hash function there is some sequence of objects to be hashed that leads to
repeated collisions, causing bad asymptotic behaviour. In cases where findall will be called
multiple times, as in section 4.1, we can avoid the difficulties associated with using a
hash table by replacing H here with a simple array. The initialization time for the array
is $O(N)$, where N is the largest domain size with which findall will be called. Every time we
run findall we fill H up in the obvious way, keeping track of which entries we filled up in a
queue and then wiping them after.
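The array-plus-queue idea can be sketched as a small Python class. The class name `ReusableMarks` and its methods are our own illustrative choices; the point is only that clearing costs time proportional to the number of entries touched, not to the size of the array.

```python
class ReusableMarks:
    """Boolean array that can be wiped in time proportional to the entries used."""

    def __init__(self, size: int):
        self.marked = [False] * size   # allocated once, O(size)
        self.touched = []              # indices set since the last clear

    def add(self, i: int) -> None:
        if not self.marked[i]:
            self.marked[i] = True
            self.touched.append(i)

    def __contains__(self, i: int) -> bool:
        return self.marked[i]

    def clear(self) -> None:
        # Undo only the entries that were actually filled.
        for i in self.touched:
            self.marked[i] = False
        self.touched.clear()


if __name__ == "__main__":
    h = ReusableMarks(8)
    h.add(3)
    h.add(5)
    print(3 in h, 4 in h)
    h.clear()
    print(3 in h)
```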

3.4 Finding a minimal d objects of different types, mindiff 

Suppose that we want to book d holidays to different destinations, and there are N flights
$y_i$ leaving our home airport to various destinations $G(y_i)$, with various costs $F(y_i)$. The
following algorithm finds us the d cheapest destinations, and their respective cheapest flights.

Theorem 4 Take a function F over a domain of size N, and another function G over the
same domain. The following algorithm mindiff finds d elements of the domain $x_i$ such that
$F(x_i)$ is minimized given that all $G(x_i)$ are distinct. More formally, given the result set
of mindiff, $x_i$, there exists no y that can "improve" the result set, by meeting either of the
following conditions:

1. $F(y) < F(x_i)$ and $G(y) = G(x_i)$ for some i. This means flight y goes to $G(x_i)$ and is
cheaper than $x_i$.

2. $F(y) < F(x_i)$ for some i, $G(y) \ne G(x_j)$ for any j. This means G(y) is a cheaper
destination than one of the $G(x_i)$; actually it means that y is a cheaper flight than
the cheapest flight we've seen so far that goes to $G(x_i)$.

mindiff achieves this in $O\left((t(F) + t(G))\left(\sqrt{Nd} + \sqrt{N \lg \epsilon^{-1}}\right) + d \lg N \lg d\right)$, with
probability $\epsilon$ of failure.

The basis for this algorithm comes from Dürr, Heiligman, Høyer and Mhalla, who in
their paper outline a procedure that we expound in step 3 below. The principle behind both
this algorithm and theirs is repeatedly to find y such that it meets either of the conditions
above, and to replace the appropriate element of the result set with the new y.

1. Let x be the array of answers. Initially, let the x[i] be "infinities," for which $F(x[i]) = \infty$,
and G(x[i]) is unique and not equal to G(y) for any y in the domain of F and G.

2. Let H be a hash table mapping G(x[i]) to i, and initialize it as such. Let T be a
balanced binary search tree containing the pair (F(x[i]), i) for all i, sorted by F(x[i]),
and initialize it as such.

3. Repeat the following until F has been evaluated $O(\sqrt{Nd})$ times, or the loop has
repeated $O(d \lg N)$ times (whichever happens first):

(a) Let r be the largest F(x[k]) in T, and k the corresponding index.

(b) Use BBHT to find some element of the domain y such that either $F(y) < r$ and
$G(y) \notin H$ (condition 2), or $G(y) \in H$ and $F(y) < F(x[H(G(y))])$ (condition 1).
Note that F(x[H(G(y))]) is the cost of the cheapest flight that we have found so
far going to y's destination, if that is currently in our result set.

(c) If condition 1 was met, set x[H(G(y))] = y, and update H and T correspondingly.
Otherwise, if condition 2 was met, set x[k] = y, and update H and T accordingly.

4. Run findsol with parameter $\epsilon^{-1}$ to check whether there is still a y that satisfies either
condition as outlined in step 3b. If not, return x. If so, repeat step 3.
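The replacement logic of steps 1 to 4 can be sketched classically in Python. This is an illustration under simplifying assumptions: the quantum searches are replaced by a uniformly random choice among improving elements, the search tree T is replaced by a linear scan for the most expensive slot, and `None` stands for an "infinity" entry. The name `mindiff_sketch` is ours.

```python
import math
import random


def mindiff_sketch(f, g, domain, d, rng):
    """Return up to d elements with distinct g-values and minimal f-values."""
    items = list(domain)
    x = [None] * d          # None plays the role of an "infinity" entry
    slot_of = {}            # H: maps a destination g(x[i]) to its slot i

    def cost(i):
        return math.inf if x[i] is None else f(x[i])

    while True:
        k = max(range(d), key=cost)      # most expensive slot (role of T)
        r = cost(k)
        improving = [
            y for y in items
            if (g(y) in slot_of and f(y) < cost(slot_of[g(y)]))   # condition 1
            or (g(y) not in slot_of and f(y) < r)                 # condition 2
        ]
        if not improving:                # nothing can improve the result set
            return [v for v in x if v is not None]
        y = rng.choice(improving)        # stand-in for the quantum search
        if g(y) in slot_of:
            x[slot_of[g(y)]] = y         # cheaper flight to a known destination
        else:
            if x[k] is not None:
                del slot_of[g(x[k])]     # evict the most expensive destination
            x[k] = y
            slot_of[g(y)] = k


if __name__ == "__main__":
    costs = [5, 3, 9, 1, 4]
    dests = ["A", "A", "B", "C", "B"]
    picks = mindiff_sketch(lambda y: costs[y], lambda y: dests[y],
                           range(5), 2, random.Random(0))
    print(sorted(picks))
```

When d exceeds the number of distinct destinations, the sketch simply returns fewer than d elements, which corresponds to the remaining slots holding infinities.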

Terminating the loop in step 3 after $O(\sqrt{dN})$ calls to F provides probability of
success $> 1/2$, which is shown by Dürr, Heiligman, Høyer and Mhalla. They also show that
$O(d)$ iterations suffice to eliminate a constant fraction of the domain from consideration,
thus $O(d \lg N)$ iterations will also provide probability of success $> 1/2$. In order to
improve the probability of success, we run findsol with parameter $\epsilon^{-1}$ to check whether we
are yet done; if we are not, we repeat step 3 until we are. Since the probability for step
3 to finish successfully after one pass is $> 1/2$, we expect to repeat it - and findsol - an
average of < 2 times. We also have to consider the contribution of updating and accessing
T, which will take $O(\lg d)$ time with every iteration; thus our total running time is
$O\left((t(F) + t(G))\left(\sqrt{dN} + \sqrt{N \lg \epsilon^{-1}}\right) + d \lg N \lg d\right)$ with probability $1 - \epsilon$ of success.

Note that if d is greater than the number of distinct values for G ($= \gamma$), we return $\gamma$
valid elements and $d - \gamma$ infinities (fictitious elements of the domain as defined in step 1).

As with findall, we use a hash table here that can be replaced by an array if mindiff is 
going to be used multiple times. 

4 Graph algorithms 

A graph is a mathematical construct made up of a set of vertices $v_a$, and a set of edges $e_{ab}$
that connect the vertices together. Typically one thinks of the vertices as locations and the
edges as connections between them: for example, one could represent bus stops in a city as
the vertices of a graph, and the paths of buses as the edges connecting them. Graphs are
widely applicable throughout the field of algorithms, sometimes showing up in unexpected
places as useful constructs to solve problems.

Each edge in a graph connects two vertices $v_a$ and $v_b$, and is either directed ($v_a \to v_b$) or
undirected ($v_a \leftrightarrow v_b$); typically graphs contain only directed or only undirected edges. In a
weighted graph edges have some weight associated with them, typically thought of as a cost
or distance associated with moving from $v_a$ to $v_b$ (and vice versa in the undirected case).
An unweighted graph can be thought of as a weighted graph whose edge weights are all 1.

With the concept of edges having some cost or length, we can discuss problems such as 
shortest paths: given a graph, what is the "shortest" path - the path of minimal summed 
length - from some source vertex to some destination vertex, or possibly to every destination 
vertex? Suppose we want the shortest paths from every vertex to every other vertex: can 
we calculate them faster than we can by running our single-source shortest paths algorithm 
from each source? What if some of the edges have negative weights: are our algorithms 
affected? 

In this section we will focus on quantum versions of long-studied classic problems such 
as shortest paths, searching through graphs, and graph matchings (suppose you want to 
pair up vertices that are connected; what's the maximum number of pairs you can make?). 

We present the algorithms here for two models of representing graphs, both of which
we will assume are given to us as quantum black boxes. In both models, V is the number
of vertices and E the total number of edges in the graph; $\mathcal{V}$ and $\mathcal{E}$ represent the vertex set
and edge set respectively. If there is an edge between vertices $v_i$ and $v_j$, we refer to it as
$e_{ij}$. The models are:

• The adjacency matrix model, as a quantum black box, is passed i, j ($0 \le i, j < V$)
and returns whether $e_{ij}$ exists. Conceptually this could be determined by some
mathematical function, but classically the graph is usually represented as a $V \times V$
matrix with entries in {0, 1}.

• The edge list model, as a quantum black box, is passed i, j and returns the destination
of the $j$th edge outgoing from vertex $v_i$ (we assume for convenience that we know how
many edges are outgoing from each vertex). Classically this is usually represented as
a ragged array, but sometimes is generated mathematically as needed. We call the
set of edges outgoing from $v_i$ d[i], and its cardinality |d[i]|. The edge list model is
sometimes called the adjacency array model.

If the graph is weighted, the adjacency matrix and edge list models also return the 
weight of the edge queried. 

For an excellent resource on graph theory and algorithms therein, please see Cormen,
Leiserson, Rivest and Stein's classic introduction to algorithms[7]. It contains detailed
discussions of breadth-first and depth-first searches, Dijkstra's algorithm and the
Bellman-Ford algorithm, as well as all-pairs shortest paths. We look at all of these in this section,
but leave the details to this reference.

In this section, we assume that the desired probability of failure $\epsilon$ is such that $\epsilon^{-1}$ is
polynomial in the number of vertices V. Note that the number of edges E can be no more
than $O(V^2)$ for the graphs we will be discussing here (see the appendices), so "polynomial in
V" is equivalent to "polynomial in E." The error analysis for this section can be found in an
appendix.

4.1 Breadth-first search, BFS

Breadth-first and depth-first search are two of the simplest algorithms for searching a graph,
and find extensive use inside many important graph algorithms. The principle behind each
is the same: starting at some source, we systematically explore the vertices of our graph,
"visiting" each vertex connected to the origin in some order. By introducing quantum
versions of each here, we tarnish their simplicity but maintain their strength and increase
their speed.

As we mentioned above, BFS and DFS both see extensive use. Both can be used to 
determine whether a vertex is connected to the rest of the graph, and breadth-first search 
in particular can be used to compute shortest paths in an unweighted graph. Depth-first 
search, on the other hand, can be used to detect "bridges" in a graph: edges which, if they 
were removed, would sever the graph into two pieces with no edges between them. There 
is a great deal of utility to be had from these two over and above what is discussed here, 
and both are very simple, solved problems in classical computing. 

To implement a breadth-first search here, we take an approach based heavily on classical
BFS: we keep a list of vertices we want to visit, and every time we visit another of those
vertices we add all of its unvisited neighbours to the list. Through use of a boolean array
we ensure each vertex is only visited and added once. To choose the order in which the
vertices are visited, we let our list be a "queue," wherein vertices added first are visited
first; thus we end up visiting the vertices in order of how close they are to the origin of our
search (breadth-first). To speed up the process of finding all of the unvisited neighbours of
each node, we use section 3.3's findall. This algorithm is based on a BFS from Ambainis
and Špalek, though they use repeated BBHTs rather than our findall.

Theorem 5 The following algorithm BFS executes a breadth-first search through a graph
G = (V, E) in $O(\sqrt{V^3 \lg V})$ time in the matrix model, $O(\sqrt{VE \lg V})$ in the edge list model.

1. Let the vertex from which we are searching be called $v_a$. Let there be a queue of
vertices q, and let it initially contain only $v_a$. Let there be a boolean array vis of size
V, with entries vis[i] = $\delta_{i,a}$.

2. Repeat the following until q is empty:

(a) Remove the first element of q and call it $v_i$.

(b) Visit $v_i$.

(c) Using section 3.3's findall, find all neighbours $v_j$ of $v_i$ with vis[j] = false.

(d) For each such $v_j$, set vis[j] = true and add $v_j$ to q.
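The structure of this search can be sketched classically in Python for the edge list model. The list comprehension marked below is a stand-in for the quantum findall call; everything else follows the steps above. The adjacency-list input format and the function name `bfs` are our own assumptions.

```python
from collections import deque


def bfs(adj, source):
    """Visit vertices reachable from `source` in breadth-first order.

    adj[i] lists the destinations of the edges leaving vertex i.
    """
    vis = [False] * len(adj)
    vis[source] = True
    q = deque([source])
    order = []
    while q:
        i = q.popleft()                 # step 2(a)
        order.append(i)                 # step 2(b): "visit" v_i
        # Step 2(c): stand-in for findall over the unvisited neighbours of v_i.
        unvisited = [j for j in adj[i] if not vis[j]]
        for j in unvisited:             # step 2(d)
            if not vis[j]:              # guards against repeated edges
                vis[j] = True
                q.append(j)
    return order


if __name__ == "__main__":
    print(bfs([[1, 2], [3], [3], []], 0))
```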

In the matrix model, each vertex $v_i$ is processed at most once and contributes $\sqrt{V n_i} + \sqrt{V \lg V}$,
where $n_i$ is the number of elements added to q. In the edge list model, each vertex is
processed at most once and contributes $\sqrt{|d[i]|\, n_i} + \sqrt{|d[i]| \lg |d[i]|}$. By the Cauchy-Schwarz
inequality, we have:

$$\sum_i \sqrt{|d[i]|\, n_i} \le \sqrt{\sum_i |d[i]|}\,\sqrt{\sum_i n_i} \le \sqrt{VE} \qquad (4.1)$$

$$\sum_i \sqrt{|d[i]| \lg |d[i]|} \le \sqrt{\sum_i |d[i]|}\,\sqrt{\sum_i \lg |d[i]|} \le \sqrt{VE \lg V} \qquad (4.2)$$

Thus BFS in the edge list model runs in $O(\sqrt{VE \lg V})$, and since $E \le V^2$, BFS in the
matrix model runs in $O(\sqrt{V^3 \lg V})$. Classically breadth-first search takes $O(E)$ time, so
BFS is faster than its classical counterpart for $E \in \Omega(V \lg V)$.

4.2 Depth-first search, DFS 

Classically, depth-first and breadth-first search can have very similar implementations, and 
the same is true in the quantum regime. The simplest implementation of depth-first search 
in both regimes, however, is a recursive one, which we show here. 

Theorem 6 The following algorithm DFS executes a depth-first search through a graph
G = (V, E) in $O(\sqrt{V^3 \lg V})$ time in the matrix model, $O(\sqrt{VE \lg V})$ in the edge list model.

1. Let the vertex from which we are searching be called $v_a$. Let there be a boolean array
vis of size V, with entries vis[i] = false. Call DFS-BODY($v_a$).

2. Function DFS-BODY(vertex $v_k$):

(a) Visit $v_k$. Set vis[k] = true.

(b) Use section 3.1's findsol to find a neighbour of $v_k$ that has not yet been visited,
$v_i$.

(c) If there is some such $v_i$:

i. Recursively call DFS-BODY($v_i$).

ii. After returning from the recursive call, go back to step 2b.

3. Return.
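A classical Python sketch of the recursion follows. The `next(...)` expression is a stand-in for the quantum findsol call: it returns some unvisited neighbour or `None`, which plays the role of false. The adjacency-list input and the names `dfs` and `body` are our own assumptions.

```python
def dfs(adj, source):
    """Visit vertices reachable from `source` in depth-first order."""
    vis = [False] * len(adj)
    order = []

    def body(k):
        order.append(k)                 # step 2(a): visit v_k
        vis[k] = True
        while True:
            # Step 2(b): stand-in for findsol over unvisited neighbours of v_k.
            nxt = next((j for j in adj[k] if not vis[j]), None)
            if nxt is None:
                return                  # the failed search ends this call
            body(nxt)                   # step 2(c)i, then loop back to 2(b)

    body(source)
    return order


if __name__ == "__main__":
    print(dfs([[1, 2], [3], [3], []], 0))
```

Each call to `body` ends with exactly one failed search, which is the source of the first contribution to the running time discussed in the analysis. For very deep graphs a Python implementation would need an explicit stack to avoid the interpreter's recursion limit.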

There are two contributions to our running time here, which we will work through in the
edge list model. The first is that for each vertex visited, findsol must fail once, leaving us
with a contribution of $O(\sqrt{VE \lg V})$ (see equation 4.2). The second contribution is the sum
of the running times of the successful findsols. We sum again over vertices, noting that for
each vertex $v_i$, if we end up finding $n_i$ of its neighbours through DFS-BODY($v_i$), the running
time of that will be $O\left(\sum_{k=1}^{n_i} \sqrt{|d[i]| \lg |d[i]| / k}\right)$, and therefore $O(\sqrt{|d[i]|\, n_i \lg |d[i]|})$.
Summing that contribution over each vertex, we again arrive at $O(\sqrt{VE \lg V})$ through
equation 4.2. In the matrix model we simply replace E with $V^2$, arriving at $O(\sqrt{V^3 \lg V})$.

Classically depth-first search takes $O(E)$ time, so DFS is faster than its classical
counterpart for $E \in \Omega(V \lg V)$.

4.3 Single-source shortest paths with negative edge weights, SPNW 

The problem of single-source shortest paths, finding the shortest paths through a graph
from some source $v_a$ to all destinations, is solved elegantly by Dürr, Heiligman, Høyer and
Mhalla with an algorithm loosely based on Dijkstra's; their algorithm does not allow
negative edge weights, so here we base an algorithm on Bellman-Ford, which does[8][9][10].
Our algorithm returns an array of shortest distances to points, or the special value false if
there exists a negative-weight cycle in the graph that can be reached from the source. It
also computes an array from, whose $i$th element is the index of the vertex previous to $v_i$ on
the shortest path from $v_a$ to $v_i$; this allows the shortest path from $v_a$ to $v_i$ to be recovered.

Intuitively, we are going to take each edge in turn and see if it helps our current shortest 
path to each point; we repeat that process V times, at which point each edge will have 
helped all it can. 

Theorem 7 Given a graph G = (V, E), the following algorithm SPNW returns an array
whose $i$th element is the shortest distance from the source $v_a$ to vertex $v_i$, $\infty$ if no such path
exists. If there is a negative weight cycle that can be reached from $v_a$, instead of an array
it returns the special value false. It does this in $O(\sqrt{V^5 \lg V})$ time in the matrix model,
$O(\sqrt{V^3 E \lg V})$ in the edge list model.

1. If we are using the edge list model, set up an array f such that f[i][j] is the source of
the $j$th edge incident on i.

2. Initialize an array dist, such that dist[i] = $\infty$ for $i \ne a$, 0 for i = a.

3. Initialize an array from, such that from[i] = -1.

4. Repeat the following V - 1 times:

(a) For each vertex $v_i$, using the algorithm of section 3.2, minfind a vertex $v_j$ such
that $e_{ji}$ exists, and dist[j] + length($e_{ji}$) is minimized. Execute the minfind by
searching over f[i] in the edge list model, $\mathcal{V}$ in the matrix model.

(b) If dist[j] + length($e_{ji}$) < dist[i], set dist[i] = dist[j] + length($e_{ji}$) and set from[i] = j.

5. Repeat steps 4a and 4b one more time. If this changes dist, return false. Otherwise return dist.
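These steps can be sketched classically in Python. The call to Python's built-in `min` stands in for the quantum minfind over the edges entering each vertex; `None` plays the role of the special value false. The input format (`incoming[i]` as a list of `(j, weight)` pairs for edges from j to i) and the names `spnw`, `frm` and `relax_all` are our own assumptions.

```python
import math


def spnw(incoming, source):
    """Bellman-Ford-style shortest paths; returns (dist, frm) or None.

    incoming[i] is a list of (j, weight) pairs, one per edge j -> i.
    None is returned if a negative-weight cycle is reachable from source.
    """
    n = len(incoming)
    dist = [math.inf] * n               # step 2
    dist[source] = 0
    frm = [-1] * n                      # step 3

    def relax_all():
        changed = False
        for i in range(n):
            if not incoming[i]:
                continue
            # Step 4(a): stand-in for minfind over the edges entering v_i.
            j, w = min(incoming[i], key=lambda e: dist[e[0]] + e[1])
            if dist[j] + w < dist[i]:   # step 4(b)
                dist[i] = dist[j] + w
                frm[i] = j
                changed = True
        return changed

    for _ in range(n - 1):              # step 4
        relax_all()
    if relax_all():                     # step 5: one extra pass
        return None
    return dist, frm


if __name__ == "__main__":
    # Edges: 0->1 (4), 0->2 (5), 2->1 (-3), 1->3 (2)
    print(spnw([[], [(0, 4), (2, -3)], [(0, 5)], [(1, 2)]], 0))
```

The shortest path to a vertex can be recovered by following `frm` back to the source.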

This algorithm, like Bellman-Ford, works due to the fact that all shortest paths in a
graph without negative weight cycles must use fewer than V edges. Each time through
step 4 we ask "could the path to vertex $v_i$ be shorter if we were allowed to use one more
edge?" Repeating this V - 1 times lets us use V - 1 edges, and repeating it a last time
lets us check whether there is a negative weight cycle. Meanwhile we keep our array from,
which tells us how we got to $v_i$ and allows us to recover the whole path. In the edge list
model, the running time is $V \sum_i \sqrt{|d[i]| \lg |d[i]|} = O(\sqrt{V^3 E \lg V})$ by equation 4.2. In the
matrix model, our E becomes a $V^2$ as usual, and we have $O(\sqrt{V^5 \lg V})$. Note that since
this is greater than $V^2$, if the graph is sparse it may be worth first converting to the edge
list model.

Classically single-source shortest paths with negative edge weights takes $O(VE)$ time,
so SPNW is faster than its classical counterpart for $E \in \Omega(V \lg V)$.

4.4 All-pairs shortest paths with negative edge weights, APSP 

Theorem 8 Given a graph G = (V, E), the following algorithm APSP returns an array
whose $i,j$th element is the length of the shortest path between vertices $v_i$ and $v_j$, $\infty$ if no such
path exists. If there is a negative weight cycle in the graph, instead of an array it returns
the special value false. It does this in $O(\sqrt{V^5}\,\lg V)$ in the matrix model,
$O(\sqrt{V^3 E}\,\lg V + V^2 \lg^3 V)$ in the edge list model.

We can do this directly with Johnson's algorithm[7][11][12]. Johnson's works by running
Dijkstra's algorithm from every origin point, which gives the shortest paths from all points
to all other points; the difficulty is that Dijkstra's does not work in graphs with
negative-weight edges, so first it is necessary to reweight edges so that all of their weights are positive.
That is accomplished through the application of a single Bellman-Ford, which also tells us
whether there are any negative-weight cycles in the graph.

In our quantum version, we alter Johnson's by replacing its call to Bellman-Ford with a
call to section 4.3's SPNW, and its calls to Dijkstra's algorithm with calls to our modification
of Dürr, Heiligman, Høyer and Mhalla's single-source shortest paths (section 5.1). The
SPNW serves to reweight the edges so that they are all positive, and then we run
single-source shortest paths from each vertex. Our total complexity is the sum of V
single-source shortest paths and one SPNW, which totals to $O(\sqrt{V^5}\,\lg V)$ in the matrix model,
$O(\sqrt{V^3 E}\,\lg V + V^2 \lg^3 V)$ in the edge list model.

Classically all-pairs shortest paths with negative edge weights takes $O(VE + V^2 \lg V)$,
so APSP is better than its classical counterpart for $E \in \Omega(V \lg^3 V)$. There is another
classical algorithm, by Zwick[13], which runs in $O(V^{2.575})$; APSP is asymptotically better
than Zwick's algorithm in the worst case.
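For orientation, here is a minimal classical sketch of Johnson's algorithm in Python, the structure that APSP inherits. It uses ordinary Bellman-Ford and a heap-based Dijkstra; in the quantum version described above these two subroutines would be replaced by SPNW and the single-source shortest paths algorithm of section 5.1. The edge format `(u, v, w)` and all function names are our own assumptions.

```python
import heapq
import math


def bellman_ford(n, edges, source):
    """Shortest distances from source, or None on a reachable negative cycle."""
    dist = [math.inf] * n
    dist[source] = 0.0
    for _ in range(n - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            return None
    return dist


def dijkstra(n, adj, source):
    """Shortest distances from source when all weights are non-negative."""
    dist = [math.inf] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                    # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def johnson(n, edges):
    """All-pairs shortest distances as an n x n list, or None on a negative cycle."""
    # A virtual vertex n with zero-weight edges to every vertex gives potentials h.
    h = bellman_ford(n + 1, edges + [(n, v, 0) for v in range(n)], n)
    if h is None:
        return None
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w + h[u] - h[v]))   # reweighted, non-negative
    result = []
    for s in range(n):
        d = dijkstra(n, adj, s)
        # Undo the reweighting to recover true path lengths.
        result.append([d[t] - h[s] + h[t] if d[t] < math.inf else math.inf
                       for t in range(n)])
    return result


if __name__ == "__main__":
    print(johnson(4, [(0, 1, 4), (0, 2, 5), (2, 1, -3), (1, 3, 2)]))
```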

5 Improvements to existing quantum graph algorithms 

It has quickly become the tradition in the literature to devise quantum algorithms
with BBHT as though there were no probability that it could fail, and then to throw a factor
of $\log(N)$ into the running time at the end to take the probability of failure into account.
Here we give two examples of algorithms that can be given faster asymptotic behaviour
with careful error analysis.

5.1 Single-source shortest paths 

Dürr, Heiligman, Høyer and Mhalla discuss algorithms for single-source shortest paths,
minimum spanning tree, connectivity and strong connectivity. The quantum query
complexity for their single-source shortest paths, $O(\sqrt{VE}\,\lg^2 V)$, can be improved by using
mindiff, whereupon it becomes $O(\sqrt{VE}\,\lg V)$. The explanation follows, and is best enjoyed
with their paper in hand.

Step 2(a) in their algorithm involves using what we have called mindiff (see section 3.4).
Their version of it runs in $O(\sqrt{Nd})$ queries to the graph and with constant probability
of failure; they repeat this $\log N$ times on every call to reduce the probability of failure
to $1/N$. We use our mindiff with $F(e_{ij}) = \text{length}(e_{ij})$, $G(e_{ij}) = j$ instead, which runs in
$\sqrt{Nd} + \sqrt{N \lg \epsilon^{-1}}$ queries to the graph.

Summing as they do to compute running time (in their notation where n = V, m = E)
we have: $\sum_{j=1}^{n/s}\left(\sqrt{s m_j} + \sqrt{m_j \lg \epsilon^{-1}} + s \lg m_j \lg s\right)$, which by the Cauchy-Schwarz
inequality (and some algebra on the last term) is $\le \sqrt{(s)(n/s)(m)} + \sqrt{(n/s)(m)(\lg \epsilon^{-1})} + n \lg s \lg(ms/n)$,
which is of order $\sqrt{nm}\left(1 + \sqrt{\lg \epsilon^{-1}/s}\right) + n \lg s \lg n$. Summing over sizes, where
$s = 1, 2, 4, \ldots, n$, we arrive at $\le \sqrt{nm}\left(2 \lg n + 2\sqrt{\lg \epsilon^{-1}}\right) + n \lg^3 n$, which is (returning to our
notation) $O(\sqrt{VE}\,\lg V + V \lg^3 V)$.

Dürr, Heiligman, Høyer and Mhalla do not make some specifics of their version of mindiff
clear, such as how they maintain the list of their best answers so far. This will inevitably
add to the total running time of their algorithm (though not its queries to the graph, which
is what they chose to analyze), and so their total running time ends up as $O(\sqrt{VE}\,\lg^2 V + \gamma)$.

Our total complexity has to include their step 2(b), finding the minimum element of all
the $A_i$ whose v is not in any $P_j$. This can be done by keeping a balanced binary search tree
T with average $O(\lg N)$ insertion/removal/access, which contains all such $A_i$. Every time
a $P_i$ of size s is changed, we remove the old elements from T and insert the new ones. This
runs in $s \lg V$ every time we change a $P_i$ of size s, and each size is created/destroyed no
more than $V/s$ times, for a total of $V \lg V$ for each size. Summing over the $\lg V$ different
sizes, we arrive at $V \lg^2 V$. Thus our total complexity remains $O(\sqrt{VE}\,\lg V + V \lg^3 V)$.

The best classical solution to this problem, Dijkstra's algorithm, runs in $O(E + V \lg V)$,
so the quantum algorithm is better for $E \in \Omega(V \lg^3 V)$.

5.2 Bipartite matching 

Ambainis and Špalek address bipartite matching, non-bipartite matching and maximum
flow. Their algorithm for bipartite matching takes $O(V\sqrt{E + V}\,\lg V)$ time, and is a
quantum adaptation of Hopcroft and Karp's classical $O((E + V)\sqrt{V})$ algorithm; we solve the
problem here in $O(V\sqrt{(E + V) \lg V})$.

The problem of bipartite matching can be described in several ways: for example, con- 
sider a collection of boys and girls to be vertices of a graph, and have an edge in the graph 
for each (boy, girl) pair that would make a good couple. In bipartite matching, we pair off 
the boys and girls in such a way that only compatible couples are paired, each person has 
at most one partner, and there is a maximum number of pairings. 

Some basic principles underlie most solutions to this problem. Consider some (non-maximum)
matching-so-far $M$ between boys and girls; if we can construct a path $P$ starting
at an unmatched boy and ending at an unmatched girl such that all edges in the path are
either unused boy → girl edges or used girl → boy edges, then the old matching can be
expanded by 1 more pair by taking $M' = M \oplus P$ (where $M \oplus P$ means taking all edges in
either $M$ or $P$, but not both). Intuitively, where $M$ and $P$ have an edge in common, we
are "unmatching" that (boy, girl) pair, and "rematching" the two using the surrounding
edges in the path. Because this path augments $M$ by adding one to its size, it is called an
"augmenting" path.

The principle behind Hopcroft and Karp's algorithm is as follows: suppose that every
time we want to find an augmenting path $P$, we find the shortest such path. They proved
that if we do that, we will see at most $2\sqrt{V}$ different path lengths in the whole process
of constructing a maximum matching. So if we devise a process to find a maximal set of
augmenting paths of minimal length (maximal means here that the set cannot be expanded
by adding more paths of the same length), we can repeat that process $O(\sqrt{V})$ times and
have constructed a maximum matching.
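
The augmenting step $M' = M \oplus P$ described above can be illustrated with a small classical sketch. The vertex names and the convention of storing every edge as a (boy, girl) pair, regardless of the direction in which the path traverses it, are assumptions made only for this example; nothing here is quantum.

```python
def augment(matching, path):
    """Return M' = M (+) P: the edges that lie in exactly one of M and P."""
    return set(matching) ^ set(path)


# Current matching: b1 is paired with g1; b2 and g2 are unmatched.
M = {("b1", "g1")}

# Augmenting path b2 -> g1 -> b1 -> g2, written as (boy, girl) edges:
# unused edge (b2, g1), used edge (b1, g1), unused edge (b1, g2).
P = [("b2", "g1"), ("b1", "g1"), ("b1", "g2")]

M_prime = augment(M, P)
```

The shared edge (b1, g1) is dropped and the two unused edges are added, so the matching grows by exactly one pair.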

The construction of a maximal set of augmenting paths of minimal length is accomplished
through the use of a breadth-first search and a depth-first search, the details of
which we leave to our references. They can however be replaced by our BFS and DFS
functions, giving us a total running time of $O(V\sqrt{E \lg V})$, a whopping $\sqrt{\lg V}$ faster than
Ambainis and Špalek's algorithm. This is also faster than the classical solution, when
$E \in \Omega(V \lg V)$.

Ambainis and Špalek also discuss non-bipartite matching and maximum flow in the same
paper; in both cases they ignore errors for the body of their algorithms, and throw on an
extra factor of $\log V$ at the end in order to reduce the probability of failure to a constant.
While that works, this section shows that it is not necessarily optimal for bipartite matching;
and due to the similarity of bipartite matching to the other problems they consider, it is
reasonable to guess that one could also achieve an $O(\sqrt{\log V})$ speedup for general matching
and flow.

6 Computational geometry algorithms 

Geometry problems are a natural area of attack for quantum algorithms, because by defining
$N$ points we have implicitly defined $O(N^2)$ relationships between those points, making it
very natural to ask questions whose answers require information $O(N^2)$ in the size of the
question. We will address points as $p_i$.

In this section, we make reference to the probability of error $\epsilon$ but do not discuss it in
depth. The error analysis for this section can be found in Appendix A.3.

6.1 Maximum points on a line, maxpoints 

This problem is, in all of its generality, a very simple one: given $N$ points, find the line that
goes through the maximum number of them. We differentiate here between a solution that
is practical for integers [15] and a slightly slower solution that is practical for real numbers,
acknowledging that practical computers, however quantum, do not offer consistent, identical
normalization for parallel vectors of real numbers.

Intuitively each algorithm works by taking a single point $p$ and finding out how many
points are on the best line that goes through $p$. We then use minfind to find the best such $p$.
In the $\mathbb{Z}^n$ case, our method is to find the vector from $p$ to each other point, canonicalize it
using GCD, and then stick all those vectors into a hash table so that we can quickly count
repeats. In the $\mathbb{R}^2$ case, our method is to sort the points in counterclockwise order about $p$
and look for collinear points, which should now be ordered consecutively.

This is a particularly interesting problem to solve in $\mathbb{Z}^2$ because it is a member of a
class of classical problems called "3SUM-hard" [TH]. Of the problems belonging to this class,
all of the known ones have classical lower-bounds of at most $\Omega(N)$, and upper bounds of
at least $O(N^2)$. All problems in the class reduce to the 3SUM problem: given a set $S$ of
$N$ integers, is there some triplet $a, b, c$ in that set such that $a + b + c = 0$? This is quite
a straightforward problem to solve with findsol in $O(N)$, while we will solve the maximum
points on a line problem in $N^{1.5}$, opening a gap of $N^{.5}$ between two similar problems, where
no such gap existed before. This raises interesting questions about the maximum points on a
line problem and a number of other problems in 3SUM-hard, and suggests that many of the
problems in 3SUM-hard (such as maxpoints) may be amenable to sub-$N^2$ solutions.
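
For reference, the 3SUM question itself can be stated as a short classical program. The sketch below is the textbook $O(N^2)$ sort-and-two-pointers check, not the findsol-based quantum search mentioned above; the function name is hypothetical and the three elements are assumed to come from distinct positions in the input.

```python
def three_sum_exists(values):
    """Classical O(N^2) test: are there a, b, c in `values` with a + b + c == 0?"""
    s = sorted(values)
    n = len(s)
    for i in range(n - 2):
        lo, hi = i + 1, n - 1
        # Two pointers walk inward over the sorted suffix.
        while lo < hi:
            total = s[i] + s[lo] + s[hi]
            if total == 0:
                return True
            if total < 0:
                lo += 1
            else:
                hi -= 1
    return False
```

Any problem in the class can be compared against this baseline: the quadratic loop is exactly the work a quantum search over candidate pairs would try to square-root.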

6.2 Maximum points on a line: $\mathbb{Z}^n$

Theorem 9 Let there be $N$ points in $\mathbb{Z}^n$, whose coordinates are bounded by $\pm U$. The
following algorithm maxpoints finds the straight line on which lies the maximum number
of those points, in $O(N^{3/2} n \lg U \sqrt{\lg \epsilon^{-1}})$ time and with probability $\epsilon$ of failure.

1. Use section 3.2's minfind to maximize the following function, mup (maximum using
p), over all points $p$. Call the result $P$.

2. Function mup: 

(a) Create an empty hash table $H$, mapping vectors in $\mathbb{Z}^n$ (keys) to integers (values).

(b) For each point $p_i$:

i. Define $\vec{a} = p_i - p$.

ii. Normalize $\vec{a}$, keeping its entries in the integers, so that the first nonzero
component is positive and the gcd of the absolute values of the components
is 1.

iii. If $\vec{a}$ is not yet in $H$, insert it in $H$ mapping to value 1; if $\vec{a}$ is already in
$H$, increment its value.

(c) Return the maximum value in $H$: the number of points on the best line going
through $p$.

3. Run mup on $P$, but instead of returning the maximum value in the hash table return
its corresponding key, and call it $\vec{V}$.

4. The answer to return is the line $X(t) = P + t\vec{V}$.

In mup, all vectors to other points from $p$ are canonicalized in such a way that any pair of
points collinear with $p$ will have the same direction vector $\vec{a}$. mup repeats $n$ gcds $N$ times,
for a total of $O(Nn \lg U)$, and our main function's most costly operation is one minfind that
evaluates mup $O(\sqrt{N \lg \epsilon^{-1}})$ times. Thus our total running time is $O(N^{3/2} n \lg U \sqrt{\lg \epsilon^{-1}})$,
and our probability of failure is $\epsilon$. Classically the problem can be solved in $N^2 n \lg U$.
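
The function mup is entirely classical, so it can be sketched directly. In the sketch below the outer minfind is replaced by a plain classical maximum over all points (which gives the $N^2 n \lg U$ classical behaviour rather than the quantum bound); the function names, the tuple representation of points, and the assumption that all points are distinct are ours.

```python
from collections import Counter
from functools import reduce
from math import gcd


def normalize(a):
    """Canonical integer direction: gcd of components 1, first nonzero one positive."""
    g = reduce(gcd, (abs(c) for c in a))
    a = tuple(c // g for c in a)
    first = next(c for c in a if c != 0)
    return a if first > 0 else tuple(-c for c in a)


def mup(p, points):
    """Return (count, direction): the largest number of other points that share
    one canonical direction from p, and that direction."""
    table = Counter(
        normalize(tuple(qi - pi for qi, pi in zip(q, p)))
        for q in points
        if q != p
    )
    if not table:
        return 0, None
    direction, count = table.most_common(1)[0]
    return count, direction


def maxpoints_classical(points):
    """Classical stand-in for the minfind over p: evaluate mup at every point."""
    best_p = max(points, key=lambda p: mup(p, points)[0])
    count, direction = mup(best_p, points)
    return best_p, direction, count
```

The returned line is $X(t) = P + t\vec{V}$ with `best_p` as $P$ and `direction` as $\vec{V}$; the line contains `count + 1` of the input points, since `count` excludes $P$ itself.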

6.3 Maximum points on a line: $\mathbb{R}^2$

Theorem 10 Let there be $N$ points in $\mathbb{R}^2$. The following algorithm finds the straight line
on which lies the maximum number of those points in $O(N^{3/2} \lg N \sqrt{\lg \epsilon^{-1}})$, with probability
of failure $\epsilon$.

1. Use minfind to maximize the following function, mup2, over all points p. Call the 
result P. 

2. Function mup2: 

(a) Let $\vec{a}_i = p_i - p$. If $a_i.x < 0$, or $a_i.x = 0$ and $a_i.y < 0$, then reverse $\vec{a}_i$. This
puts all points to the right of $p$.

(b) Sort the $\vec{a}_i$ as follows: $\vec{a}_i < \vec{a}_j$ iff $(\vec{a}_i \times \vec{a}_j) \cdot \hat{z} > 0$. This has the effect of sorting
the $p_i$ in counter-clockwise order about $p$.

(c) Iterate over the sorted array, keeping a running total of how many consecutive
$\vec{a}_i$ have cross product of 0 with one another. Return the maximum such total.
(Practically, we should see how many consecutive $\vec{a}_i$ have cross product $< \delta$ for
some small $\delta$, and loop through a second time to catch the nearly-straight-up
and nearly-straight-down $\vec{a}_i$.)

3. Run mup2 on $P$, but instead of returning the maximum total, return some point
(other than $P$) on the line giving that total. Call it $P'$.

4. The answer to return is the line $X(t) = P + t(P' - P)$.

This algorithm sorts the points about each point $p$, which has the effect of grouping
collinear points together. Then it simply counts how many consecutive collinear points it
can find. mup2 is $O(N \lg N)$, and our most costly operation is one minfind that evaluates
mup2 $O(\sqrt{N \lg \epsilon^{-1}})$ times, for a total running time of $O(N^{3/2} \lg N \sqrt{\lg \epsilon^{-1}})$ and probability
of failure $\epsilon$. Classically this problem can be solved in $O(N^2 \lg N)$.
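
A minimal classical sketch of mup2 follows, under some assumptions of ours: points are (x, y) tuples, all points are distinct, and the tolerance `delta` defaults to 0 so that the example is exact on integer inputs. With a nonzero tolerance the second pass mentioned in step (c) would also be needed; it is omitted here.

```python
from functools import cmp_to_key


def cross(a, b):
    """z-component of the cross product of two plane vectors."""
    return a[0] * b[1] - a[1] * b[0]


def mup2(p, points, delta=0.0):
    """Largest number of other points lying on a single line through p."""
    vecs = []
    for q in points:
        if q == p:
            continue
        a = (q[0] - p[0], q[1] - p[1])
        # Step (a): flip vectors so they all point into the right half-plane.
        if a[0] < 0 or (a[0] == 0 and a[1] < 0):
            a = (-a[0], -a[1])
        vecs.append(a)

    # Step (b): counter-clockwise order, a before b iff cross(a, b) > 0.
    def ccw(a, b):
        c = cross(a, b)
        return -1 if c > 0 else (1 if c < 0 else 0)

    vecs.sort(key=cmp_to_key(ccw))

    # Step (c): longest run of consecutive (near-)parallel vectors.
    best = run = 0
    for k, a in enumerate(vecs):
        if k > 0 and abs(cross(vecs[k - 1], a)) <= delta:
            run += 1
        else:
            run = 1
        best = max(best, run)
    return best
```

Because every vector lies in a half-plane spanning less than 180 degrees, the sign of the cross product gives a consistent ordering, which is what makes the comparator in step (b) well defined.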

7 Dynamic Programming algorithms 

Dynamic programming (DP) is a method that solves problems by combining the solutions to 
subproblems. DP algorithms achieve this by partitioning their problems into subproblems, 
solving the subproblems recursively, and then combining the solutions to solve the original 
problem. What distinguishes dynamic programming from other approaches is that the 
subproblems are not independent: subproblems share sub-subproblems with one another. 
A dynamic programming algorithm solves every sub-subproblem only once and saves its 
result in a table, thus eliminating the need to recompute the answer for a sub-subproblem 
every time it is needed. 

Dynamic programming is often used to solve optimization problems. Given some situation
(a problem), come up with a choice (each possible choice leads to a subproblem) that
optimizes some final quantity (way down at the sub$^n$-problem level). We will see an example
of this in section 7.1. Since DP is often used to make some sort of optimal choice, DP
algorithms in general are obvious candidates for section 3.2's minfind, which square-roots
the process of checking all our options.

In this section, we assume that the desired probability of failure $\epsilon$ is such that $\epsilon^{-1}$ is
polynomial in the size of the input. In some places this affects the running time, and so we
make reference to $\epsilon$ but do not discuss it in depth. The error analysis for this section can
be found in Appendix A.4.

7.1 Coin changer, coinchange 

Given a monetary system with some set of coins and bills, we may wish to make some
precise amount of money - the coin changer problem is to use as few coins and bills as
possible. Intuitively, this is easy: with Canadian or American money, for example, to make
$D$ cents one can simply take the largest bill/coin of value $v \leq D$, then make $D - v$ cents
in the same way. For example, to make 40¢ one would take the largest coin less than 40¢
(25¢), then the largest coin less than the remaining 15¢ (10¢), and finally a 5¢ coin. This
is a greedy approach that works for most real currencies, but it is not always optimal: for
example, should a 20¢ piece be added to the Canadian system, then making 40¢ only takes
two coins, but the greedy approach will still cause us to use three. Should the reader ever
travel to Costa Rica or Bhutan, he or she will encounter a non-greedy currency system.

Theorem 11 Given a length $C$ integer array of coin denominations $V$, as well as an integer
$D$, the following algorithm coinchange returns the minimum number of coins required
to make $D$ units, or $\infty$ if making $D$ units of currency is impossible. It achieves this in
$O(D\sqrt{C \lg D})$ time.

Since we are trying to minimize a quantity, the number of coins used, making $D$ units
optimally is a matter of choosing one coin $V[i]$ to use, then making $D - V[i]$ units optimally.
To do so we build up a table $T$, where $T[i]$ is the minimum number of coins needed to make
$i$ units. We start by filling in $T[i]$ with $i$ small, since later entries will depend on earlier
ones.

1. Let there be an array $T$ of size $D + 1$, such that initially $T[0] = 0$, and $T[i \neq 0] = \infty$.

2. For $d$ from 1 to $D$, DO:

(a) Use the algorithm of section 3.2 to minfind one of the coins $V[i]$ such that
$d - V[i] \geq 0$, and $1 + T[d - V[i]]$ is minimal.

(b) If such a coin was found, let $T[d] = 1 + T[d - V[i]]$.
DONE.

3. Return $T[D]$.

Here we simply fill in the table as discussed above, by using minfind to determine which
coin should be taken first. The minfind takes $O(\sqrt{C \lg D})$ time, and is repeated $D$ times
for a total time complexity of $O(D\sqrt{C \lg D})$.
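
The table-filling loop can be sketched classically as follows. Here Python's built-in `min` stands in for the quantum minfind of step 2(a), so this sketch runs in the classical $O(DC)$ time rather than the quantum bound; the function name mirrors the text, and coin values are assumed to be positive integers.

```python
import math


def coinchange(V, D):
    """Minimum number of coins from denominations V that sum to D (inf if impossible)."""
    # T[d] = fewest coins needed to make d units; T[0] = 0, everything else unknown.
    T = [0] + [math.inf] * D
    for d in range(1, D + 1):
        # Candidate coins are those that do not overshoot d.
        candidates = [1 + T[d - v] for v in V if d - v >= 0]
        if candidates:
            T[d] = min(candidates)  # classical stand-in for minfind
    return T[D]
```

With the hypothetical 20¢ piece from the discussion above, `coinchange([1, 5, 10, 20, 25], 40)` finds the two-coin answer that the greedy method misses.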

We discuss this example because it is very representative of how one can
improve dynamic programming algorithms in general using quantum techniques, and as
such is a good forum for the discussion of quantum DP in general. For example, many
dynamic programming algorithms, including this one, have alternate recursive implementations:
rather than consulting entries of a table that have already been filled in, we call our
function recursively on their indices. Rather than consulting $T[x]$, we call mincoins(x), and
it calls mincoins(x - 25) and mincoins(x - 10), etc. To save ourselves from exponential
repetition, whenever we compute the result for a subproblem we cache it, so that the next
time mincoins is called with the same parameters, we simply return the result. The advantage
of recursive DP (often called memoization) is that for many people it is very intuitive
to write a recursive function that computes the result, then throw in a few lines that cache
and retrieve the cached value.
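
A classical sketch of the memoized variant is shown below. The name mincoins comes from the text; the factory function `make_mincoins` and the use of `functools.lru_cache` as the cache are choices made only for this illustration. Python's recursion limit restricts this form to modest values of x.

```python
import math
from functools import lru_cache


def make_mincoins(V):
    """Build a memoized mincoins(x) for the coin denominations V."""

    @lru_cache(maxsize=None)
    def mincoins(x):
        if x == 0:
            return 0
        # Recurse on every coin that fits; results are cached by lru_cache.
        options = [1 + mincoins(x - v) for v in V if x - v >= 0]
        return min(options, default=math.inf)

    return mincoins
```

The cache is the "few lines" the text refers to: without the decorator the function is still correct but takes exponential time.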

Classically, memoization is valuable primarily as an alternate way of implementing dy- 
namic programming; it is only faster in rare cases. Indeed, many DP algorithms are more 
efficient (use less memory) when implemented iteratively, and some few have no clear im- 
plementation through memoization. 

To implement memoization in the quantum case, one could use findsol to find the subproblems
whose solutions have not been cached yet, call those recursively, and then take the
appropriate action, such as a minfind over the subproblems. There is no clear alternative
to this approach, which is unfortunate: it can lead to asymptotically longer running times
than standard DP. This is a little tricky to prove, and somewhat outside the scope of this
paper; for those who are interested, we suggest considering a carefully chosen dependency
graph such that there is a set $X$ of many states with no dependencies, and an asymptotically
smaller set $Y$ of states that depend on subsets of $X$ ($X$ might have size $N^6$, $Y$ have size
$N^4$, and each element of $Y$ could depend on $N^4$ elements of $X$).

7.2 Maximum subarray sum, subarray-sum 

Theorem 12 Given an $N \times N$ array of real numbers $A$, the following algorithm subarray-sum
finds a rectangular subarray such that the sum of the subarray's elements is maximized,
in $O(N^2 \sqrt{\lg \epsilon^{-1}})$ time and with probability of failure $\epsilon$. We will address the result by its
limits: (miny, minx, maxy, maxx).

This is another classic problem, for which the best known classical solution
was found by Tamaki. There is a much more straightforward
(though still clever) $O(N^3)$ solution, which involves maximizing the sum of all $O(N^2)$ possible
column ranges, each in $O(N)$.

Our algorithm begins by creating a table $T$ that makes checking the sum for an arbitrary
rectangle $O(1)$, and then simply minfinds over all rectangles. This algorithm, like the
classical one, is really greedy rather than dynamic programming; we include it in this
section because the construction of $T$ is DP.

1. Let there be an $N \times N$ array $T$, whose $i,j$ element will hold the sum for subarray
$(0, 0, i, j)$. Initialize its entries to 0, and define $T[i][j] = 0$ if $i$ or $j$ is negative. The
next step will fill in $T$ as desired.

2. For $i$ from 0 to $N - 1$, For $j$ from 0 to $N - 1$ DO:

(a) $T[i][j] = A[i][j] + (T[i-1][j] + T[i][j-1] - T[i-1][j-1])$.
DONE.

3. There are $N^4$ possible rectangular subarrays. The summation over any such array is
$T[maxy][maxx] - T[maxy][minx - 1] - T[miny - 1][maxx] + T[miny - 1][minx - 1]$,
which is an $O(1)$ calculation. Use the algorithm of section 3.2 to minfind over all such
(miny, minx, maxy, maxx) and find the subarray with the maximum summation, and
then return it.

The creation of $T$ takes $O(N^2)$, and the minfind takes $O(N^2 \sqrt{\lg \epsilon^{-1}})$ and has probability
of failure $\epsilon$. The dynamic programming part of this algorithm is the construction of $T$, which
could also be implemented using memoization as discussed above.
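
The table construction and the constant-time rectangle sum can be sketched classically. In this sketch the quantum minfind over all $N^4$ rectangles is replaced by an exhaustive classical maximum, so it only illustrates the data structure, not the quantum running time; the helper names and the closure `t` that returns 0 for negative indices are assumptions of the example.

```python
from itertools import product


def build_table(A):
    """Return t(i, j) = sum of A[0..i][0..j], with t = 0 when i or j is negative."""
    n = len(A)
    T = [[0] * n for _ in range(n)]

    def t(i, j):
        return T[i][j] if i >= 0 and j >= 0 else 0

    for i in range(n):
        for j in range(n):
            T[i][j] = A[i][j] + t(i - 1, j) + t(i, j - 1) - t(i - 1, j - 1)
    return t


def subarray_sum(A):
    """Return (best_sum, (miny, minx, maxy, maxx)) by checking every rectangle."""
    n = len(A)
    t = build_table(A)

    def rect_sum(miny, minx, maxy, maxx):
        # O(1) inclusion-exclusion on the prefix-sum table.
        return (t(maxy, maxx) - t(maxy, minx - 1)
                - t(miny - 1, maxx) + t(miny - 1, minx - 1))

    rects = (r for r in product(range(n), repeat=4)
             if r[0] <= r[2] and r[1] <= r[3])
    best = max(rects, key=lambda r: rect_sum(*r))
    return rect_sum(*best), best
```

The generator `rects` is exactly the search space handed to minfind in step 3; only the way the maximum is located differs.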

8 Conclusions 

We summarize our results from the preceding sections here. Results from the tables below should be
checked against Appendix A for their exact error-dependence: in the interest of brevity, we
often assume the probability of error $\epsilon$ to be such that $\epsilon^{-1}$ is polynomial in $N$ (or $V$), or is
constant.



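problem                      | quantum complexity                                          | classical (avg)
-----------------------------|-------------------------------------------------------------|----------------
finding one solution         | 0(^N/M + ^/Nlge- l /M Lm )                                  | $O(N/M)$
same algorithm, no solutions | $O(\sqrt{N \lg \epsilon^{-1}})$                             | $O(N)$
minimum finding              | $O(\sqrt{N \lg \epsilon^{-1}})$                             | $O(N)$
finding all M solutions      | $O(\sqrt{NM} + \sqrt{N \lg \epsilon^{-1}})$                 | $O(N)$
finding d min. diff. objects | $O(\sqrt{Nd} + \sqrt{N \lg \epsilon^{-1}} + d \lg N \lg d)$ | $O(N)$

Table 1: Tools. The unit of time is calls to F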



Note that several of our graph algorithms can run more slowly than their classical
counterparts for $E$ sufficiently small; in each such case there is some $a$ such that the quantum
algorithm is faster if $E \in \Omega(V \lg^a V)$.

problem                           | quantum complexity                      | classical
----------------------------------|-----------------------------------------|----------------
breadth-first search              | $O(\sqrt{VE \lg V})$                    | $O(E)$
depth-first search                | $O(\sqrt{VE \lg V})$                    | $O(E)$
single src. short. paths (± wt.)  | $O(\sqrt{V^3 E \lg V})$                 | $O(VE)$
all-pairs short. paths (± wt.)    | $O(\sqrt{V^3 E} \lg V + V^2 \lg^3 V)$   | $O(V^{2.575})$

Table 2: Graph theory in edge list model: change $E$ to $V^2$ for matrix model complexity



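problem                           | quantum complexity                  | classical
----------------------------------|-------------------------------------|--------------------
single src. short. paths (+ wt.)  | $O(\sqrt{VE} \lg V + V \lg^3 V)$    | $O(E + V \lg V)$
same, previous quantum            | $O(\sqrt{VE} \lg^2 V + ?)$          |
bipartite matching                | $O(V\sqrt{(E + V) \lg V})$          | $O((E + V)\sqrt{V})$
same, previous quantum            | $O(V\sqrt{E + V} \lg V)$            |

Table 3: Improvements to quantum graph algorithms from other papers, in edge list model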



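problem                              | quantum complexity   | classical
-------------------------------------|----------------------|---------------
points on a line ($\mathbb{Z}^n$)    | $N^{3/2} n \lg U$    | $N^2 n \lg U$
points on a line ($\mathbb{R}^2$)    | $N^{3/2} \lg N$      | $N^2 \lg N$
coin changer                         | $D\sqrt{C \lg D}$    | $DC$
maximum subarray sum                 | $N^2$                | $N^3$

Table 4: Computational geometry and dynamic programming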



In this paper we have chosen to focus on deriving new algorithms rather than proving
lower bounds. As such, it is possible that the algorithms presented here are not optimal,
presenting clear directions for future research: searching for lower bounds that approach
the upper-bounds presented here, and finding faster algorithms. There are few published
quantum algorithms (at least when viewed in the context of the number of published classical
algorithms!) so there is limited sport to be had in picking them apart to save factors of
$\sqrt{\lg N}$; on the other hand, there is a vast field full of classical algorithms with no quantum
counterparts, and much of the low-hanging fruit remains untouched.

A Detailed error analysis

Here we present in more exacting detail the parameters $\epsilon^{-1}$ that are passed from function
to function from section 4 on, as well as brief (but complete) error analysis. $\epsilon$ in this
appendix will always denote the probability of failure for a function. We pass the parameter
$\epsilon^{-1}$ rather than $\epsilon$ because $\epsilon^{-1}$ is often polynomial in the input, and is thus more convenient
to discuss.
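
The pattern used throughout this appendix is a union bound: if a subroutine is called $k$ times and the whole computation should fail with probability at most $\epsilon$, each call is given parameter $k\epsilon^{-1}$. The following sketch, with hypothetical helper names, checks that arithmetic for one choice of numbers.

```python
def per_call_parameter(calls, eps_inverse):
    """Parameter (inverse failure probability) to pass to each of `calls` calls."""
    return calls * eps_inverse


def overall_failure_bound(calls, per_call_eps_inverse):
    """Union bound on the probability that at least one of the calls fails."""
    return calls / per_call_eps_inverse


# Example: V = 1000 calls, desired overall failure probability 1/100.
V, eps_inverse = 1000, 100
param = per_call_parameter(V, eps_inverse)
bound = overall_failure_bound(V, param)

# Exact failure probability if the calls fail independently with probability 1/param.
exact = 1 - (1 - 1 / param) ** V
```

Since the running times of the tools depend on the parameter only through $\sqrt{\lg \epsilon^{-1}}$, multiplying it by a polynomial number of calls changes the cost by at most a $\sqrt{\lg}$ factor.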

A.1 Graph algorithms

Breadth-first search: in section 4.1's step 2c we call findall. It should be called with
parameter $V\epsilon^{-1}$, giving the $V$ calls to it probability $1 - \epsilon$ of all succeeding. As this is our
only function call that may fail, it gives the whole BFS function probability $\epsilon$ of failure and
running time $O\left(\sqrt{V^3 \lg(V\epsilon^{-1})}\right)$ in the matrix model, $O\left(\sqrt{VE \lg(V\epsilon^{-1})}\right)$ in the edge list
model.

Depth-first search: in section 4.2's step 2b we call findsol. It should be called with
parameter $2V\epsilon^{-1}$, giving the $2V$ calls to it (one to find each vertex, one from each vertex to
find nothing) probability $1 - \epsilon$ of all succeeding. As this is our only function call that may fail,
it gives the whole DFS function probability $\epsilon$ of failure and running time $O\left(\sqrt{V^3 \lg(V\epsilon^{-1})}\right)$
in the matrix model, $O\left(\sqrt{VE \lg(V\epsilon^{-1})}\right)$ in the edge list model.

Single-source shortest paths with negative edge weights: in section 4.3 we call
minfind. It should be called with parameter $V^2\epsilon^{-1}$, giving the $V^2$ calls to it probability
$1 - \epsilon$ of all succeeding. As this is our only function call that may fail, it gives the whole
SPNW function probability $\epsilon$ of failure and running time $O\left(\sqrt{V^5 \lg(V\epsilon^{-1})}\right)$ in the matrix
model, $O\left(\sqrt{V^3 E \lg(V\epsilon^{-1})}\right)$ in the edge list model.

All-pairs shortest paths: in section 4.4 we call SPNW once and single-source shortest
paths $V$ times. Each should be called with parameter $(V + 1)\epsilon^{-1}$, giving the $V + 1$
total calls probability $1 - \epsilon$ of all succeeding. As these are our only function calls that
may fail, they give the whole APSP function probability $\epsilon$ of failure and running time
$O\left(\sqrt{V^5}\left(\lg V + \sqrt{\lg(V\epsilon^{-1})}\right) + V^2 \lg^3 V\right)$ in the matrix model,
$O\left(\sqrt{V^3 E}\left(\lg V + \sqrt{\lg(V\epsilon^{-1})}\right) + V^2 \lg^3 V\right)$ in the edge list model.

A.2 Improvements to existing quantum graph algorithms

Single-source shortest paths: in section 5.1 we call mindiff. In Dürr, Heiligman, Høyer
and Mhalla's notation, for each size $s$, we call mindiff $n/s$ times; summing over sizes,
we have $\sum_{k=0}^{\lg n} n/2^k \leq 2n$. Switching back to our notation, that means we call it $2V$ times
and require success each time, which means it should be called with parameter $2V\epsilon^{-1}$,
giving the $2V$ calls to it probability $1 - \epsilon$ of all succeeding. As this is our only function
call that may fail, it gives the whole function probability $\epsilon$ of failure and running time
$O\left(\sqrt{VE}\left(\lg V + \sqrt{\lg(V\epsilon^{-1})}\right) + V \lg^3 V\right)$.

Bipartite matching: in section 5.2 we call BFS and DFS $\leq 2\sqrt{V}$ times each. Each
should be called with parameter $(4\sqrt{V})\epsilon^{-1}$, giving the $4\sqrt{V}$ total calls probability $1 - \epsilon$ of all
succeeding. As these are our only function calls that may fail, they give the whole bipartite
matching function probability $\epsilon$ of failure and running time $O\left(V\sqrt{(E + V) \lg(V\epsilon^{-1})}\right)$.

A.3 Computational geometry algorithms

Maximum points on a line in $\mathbb{Z}^n$: in section 6.2 we call minfind. It should be called with
parameter $\epsilon^{-1}$, giving the sole call to it probability $1 - \epsilon$ of succeeding. As this is our only
function call that may fail, it gives the whole function probability $\epsilon$ of failure and running
time $O(N^{3/2} n \lg U \sqrt{\lg \epsilon^{-1}})$.

Maximum points on a line in $\mathbb{R}^2$: in section 6.3 we call minfind. It should be called
with parameter $\epsilon^{-1}$, giving the sole call to it probability $1 - \epsilon$ of succeeding. As this is
our only function call that may fail, it gives the whole function probability $\epsilon$ of failure and
running time $O(N^{3/2} \lg N \sqrt{\lg \epsilon^{-1}})$.

A.4 Dynamic Programming algorithms

Coin changer: in section 7.1's step 2a we call minfind. It should be called with parameter
$D\epsilon^{-1}$, giving the $D$ calls to it probability $1 - \epsilon$ of all succeeding. As this is our only function
call that may fail, it gives the whole coinchange function probability $\epsilon$ of failure and running
time $O\left(D\sqrt{C \lg(D\epsilon^{-1})}\right)$.

Maximum subarray sum: in section 7.2's step 3 we call minfind. It should be called
with parameter $\epsilon^{-1}$, giving the sole call to it probability $1 - \epsilon$ of succeeding. As this is our
only function call that may fail, it gives the whole subarray-sum function probability $\epsilon$ of
failure and running time $O(N^2 \sqrt{\lg \epsilon^{-1}})$.

B Comments and caveats 

We mention here some comments that are important to the content of the paper, but that 
we felt broke up its flow too much to include in the body. 

Asymptotic notation ($O$, $\Omega$ and $\Theta$): informally, saying that a function takes $\Theta(f(N))$
time means that as $N$ goes to infinity, if we take the algorithm's running time and divide
it by $f(N)$, we will get a nonzero constant; intuitively, that the function takes "order"
$f(N)$ time to complete. If a function takes $O(f(N))$ time, the algorithm's running time is
upper-bounded by $f(N)$; $\Omega(f(N))$ is a lower-bound. Throughout the paper we somewhat
informally call our algorithms $O(f(N))$, which we do because while the algorithm itself may
be $\Theta(f(N))$, the existence of that algorithm proves that the problem it solves is $O(f(N))$.
In our graph algorithm sections we analyze many of our algorithms by saying they are better than the classical
version for $E \in \Omega(V \lg^a V)$: this simply means that if we take the size of the graph to
infinity, the algorithm is better as long as the number of edges goes to infinity at least as
fast as $V \lg^a V$.

Types of graph: all graph algorithms presented here assume the graphs they operate on
will have at most one edge between any two vertices (or two edges in opposite directions,
in the directed case), and no "self-edges" $e_{aa}$. Most of these algorithms are very easy to
generalize to graphs that do not have that property, but in the interests of brevity we do
not discuss that.

Large numbers: it is assumed throughout the body of the paper that basic arithmetic
and addressing operations take constant time. This is not the case as the size of input
goes to infinity: take for example a graph with $2^{100}$ vertices. Each vertex takes 100 qubits
to address, and so looking at an edge out of it is an $O(\lg V)$ operation. The net effect of
this is that every algorithm discussed in this paper has an unmentioned $\lg N$ (resp. $\lg V$)
factor that we have not included in its running time. In the literature when algorithms
are analyzed, it is often the case that this extra factor is not included; so without opening
that particular can of worms, we simply acknowledge that there is an extra factor of $\lg N$
everywhere, without putting it in the body of the paper. Not including the extra factor
is the tradition in much of classical computing, and is consistent with other papers on
quantum algorithms.

C BBHT: probability of failure and running time 

Here we explore, in some detail, the probability of failure and running time of Boyer, Brassard,
Høyer and Tapp's algorithm for quantum searching: in particular, their algorithm
that finds one of an unknown number of solutions to a function $F$. Recall that $F$ maps
a domain of size $N$ to $\{0, 1\}$, and has $M$ solutions $x$ such that $F(x) = 1$; and that their
algorithm runs in $O(\sqrt{N/M})$ calls to $F$. The authors discuss average running time, but
give scant attention to what happens if there is no solution; other papers (see for example
[18]) explore the algorithm in slightly more detail, but not to the degree we would like. Here
we attempt to encapsulate both average running time and probability of failure, as well as
the running time's dependence on $\lambda$, a constant chosen by the authors to be 8/7.

In this appendix we assume familiarity with Grover's original algorithm [5]. In particular,
we ask that the reader be comfortable with the following:

• What it means to run Grover's algorithm with $j$ Grover iterations.

• Let $\theta$ be such that $\sin^2\theta = M/N$. Then the probability of success when Grover's
algorithm is run with $j$ Grover iterations is $\sin^2((2j + 1)\theta)$.

C.1 The BBHT algorithm

In the original algorithm, there is no provision for $M = 0$; in that case, it runs forever. We
change this by inserting the condition $m > 2\sqrt{N}$ (see below), at which point our algorithm
decides there is no solution and returns false.

1. Initialize $m = 1$ and set $\lambda = 8/7$.

2. While $m \leq 2\sqrt{N}$, repeat the following unless instructed to return:

(a) Choose an integer $j$ uniformly at random such that $0 \leq j < m$.

(b) Execute Grover's original algorithm, using $j$ Grover iterations. Let the outcome
be called $i$.

(c) If $F(i) = 1$, return $i$; otherwise, set $m$ to $\lambda m$.

3. Return false. 
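
The control flow of this loop can be explored with a purely classical simulation. The sketch below does not run Grover's algorithm; it assumes only the stated fact that a run with $j$ iterations succeeds with probability $\sin^2((2j+1)\theta)$ and draws a random number accordingly. The function name, the cost measure (total Grover iterations) and the `rng` parameter are assumptions made for the illustration.

```python
import math
import random


def bbht_simulated(N, M, lam=8 / 7, rng=random):
    """Simulate the modified BBHT loop; return (found, total_grover_iterations)."""
    theta = math.asin(math.sqrt(M / N)) if M > 0 else 0.0
    m = 1.0
    cost = 0
    while m <= 2 * math.sqrt(N):
        # Integer j chosen uniformly with 0 <= j < m (m itself need not be an integer).
        j = rng.randrange(math.ceil(m))
        cost += j
        # A Grover run with j iterations succeeds with probability sin^2((2j+1) theta).
        if rng.random() < math.sin((2 * j + 1) * theta) ** 2:
            return True, cost
        m *= lam
    return False, cost
```

With $M = 0$ the success probability is always zero, so the loop runs until $m$ exceeds $2\sqrt{N}$ and the function reports that no solution exists, which is the behaviour the added condition is meant to guarantee.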

Intuitively, BBHT works by trying several different numbers of Grover iterations, which
(depending on how many iterations there were) will yield different probabilities of success
for different values of $M$. On average the algorithm as a whole will fail with probability
$< .5M^{-.93}$, as we will see.

C.2 Probability of failure and running time 

The probability of failure for BBHT is the probability that, for each $m$ up to $2\sqrt{N}$, Grover's
algorithm never successfully returns a result when there is one to return. To calculate that
probability, first we need a result derived by Boyer, Brassard, Høyer and Tapp [2]: first, recall
that after $j$ Grover iterations, the probability of returning a valid result is $\sin^2((2j + 1)\theta)$.
For a given $m$, $j$ could be any of $0 \ldots m - 1$, and averaging over those values they arrive at
a probability of $\frac{1}{2} + \frac{\sin(4m\theta)}{4m\sin(2\theta)}$ that an invalid result will be returned, for $m$ an integer. $m$ is
of course not actually an integer, but by choosing a random integer $0 \leq j < m$, we treat it
as one and can consider it to be one for the purposes of that formula.
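
The averaged failure probability quoted above can be checked numerically: the mean of $\cos^2((2j+1)\theta)$ over $j = 0, \ldots, m-1$ should equal $\frac{1}{2} + \frac{\sin(4m\theta)}{4m\sin(2\theta)}$. The following sketch, with function names of our choosing, compares the two for an integer $m$.

```python
import math


def average_failure(m, theta):
    """Mean failure probability over j = 0..m-1, where failure = 1 - sin^2((2j+1) theta)."""
    return sum(math.cos((2 * j + 1) * theta) ** 2 for j in range(m)) / m


def closed_form(m, theta):
    """The closed form 1/2 + sin(4 m theta) / (4 m sin(2 theta))."""
    return 0.5 + math.sin(4 * m * theta) / (4 * m * math.sin(2 * theta))
```

The agreement follows from the identity $\sum_{j=0}^{m-1} \cos((2j+1)\alpha) = \frac{\sin(2m\alpha)}{2\sin\alpha}$ applied with $\alpha = 2\theta$.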

We wish to upper-bound the probability of error for BBHT as a whole, and we will start
by differentiating between the cases $0 < \theta < \pi/4$ ($M < N/2$) and $\pi/4 \leq \theta \leq \pi/2$ ($M \geq N/2$). For
any $M < N/2$, we wish to find an $m_0$ such that for each repetition of the outer loop when
$m > m_0$, the probability of failure is less than or equal to some constant. For $M \geq N/2$,
we will find that the probability of failure is always less than or equal to some constant.

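We begin by considering $M < N/2$. In order to find $m_0$, first we have to find critical
points of $f_\theta(m) = \frac{1}{2} + \frac{\sin(4m\theta)}{4m\sin(2\theta)}$, the probability that an invalid result will be returned:

\[ \frac{d f_\theta(m)}{dm} = 0 \]
\[ \frac{4\theta\cos(4m\theta)}{4m\sin(2\theta)} - \frac{\sin(4m\theta)}{4m^2\sin(2\theta)} = 0 \]
\[ 4m\theta = \tan(4m\theta) \]
\[ 4m\theta = 0, 4.49, 7.73, \ldots \]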

Now we consider the form of $f_\theta(m)$. It starts off at $f_\theta(0) = \frac{1}{2} + \frac{\theta}{\sin(2\theta)}$ and decreases from
there; we want to find the first maximum it will return to after dipping down, meaning
$4m\theta = 7.73$. Since $0 < \theta < \pi/4$, we use $\sin(2\theta) > \frac{4}{\pi}\theta$, and arrive at (when $4m\theta = 7.73$)
$f_\theta(m_0) < \frac{1}{2} + \frac{\pi}{4} \cdot \frac{\sin(7.73)}{7.73} \approx 0.6$. That does not give us $m_0$, however: $m_0$ is when $f_\theta(m)$ first
dips that low. Solving numerically and using $\sin\theta < \theta$:

\[ 0.6 = \frac{1}{2} + \frac{\pi}{4} \cdot \frac{\sin(4m\theta)}{4m\theta} \]
\[ 4m\theta \leq 2.78 \]
\[ m \leq 0.69/\sin(\theta) \]
\[ m \leq 0.69\sqrt{N/M} \]

For $\pi/4 \leq \theta \leq \pi/2$, although $f_\theta(m)$ is well-behaved and slowly-oscillating over the space of
integer values of $m$, it oscillates wildly in between; so our previous approach, based on
considering $f_\theta$ as a function acting on the continuum, will not work. To fix this problem,
instead of considering $\theta$, we now consider the angle $\phi = \frac{\pi}{2} - \theta$; first noting that $f_\theta(m) =
1 - f_\phi(m)$, meaning that success for $\theta$ corresponds to failure for $\phi$:

\[ f_\theta(m) = \frac{1}{2} + \frac{\sin(4m\theta)}{4m\sin(2\theta)} = \frac{1}{2} + \frac{\sin(4m(\frac{\pi}{2} - \phi))}{4m\sin(\pi - 2\phi)} = \frac{1}{2} - \frac{\sin(4m\phi)}{4m\sin(2\phi)} \]

Now we are back in the elysian realm of $0 < \phi \leq \pi/4$, and we can bound the probability of
failure for $\phi$ from below and use that result. The procedure here is as before, but instead
of 7.73 we use the first root of $\tan(4m\phi) = 4m\phi$, 4.49. For $\phi \leq \pi/4$ we use $\sin(2\phi) < 2\phi$, and
arrive at (when $4m\phi = 4.49$) $P_{fail} > \frac{1}{2} + \frac{\sin(4.49)}{2 \cdot 4.49} \approx 0.39$. That is the lowest the probability
of failure $f_\phi(m)$ ever gets, and correspondingly it is the lowest the probability of success
$1 - f_\theta(m)$ ever gets.

We now have that, for any given iteration of the outer loop, the probability of failure
for $M \geq N/2$ is less than or equal to 0.61 for all $m$, and the probability of failure for
$M < N/2$ is less than or equal to 0.6 for $m > m_0 = 0.69\sqrt{N/M}$. We now compute the
total probability of failure and running time for each case.

For $M > N/2$, the total probability of failure is simply $0.61^{\log_\lambda(2\sqrt{N})} = (2\sqrt{N})^{\log_\lambda 0.61}$, and the probability of getting to the $k$th iteration through the main loop is $0.61^k$. The total running time, then, is the sum $\sum_{k=0}^{\log_\lambda(2\sqrt{N})} \frac{\lambda^k}{2}(0.61)^k < \frac{1}{2}\,\frac{1}{1 - 0.61\lambda}$.

For $M < N/2$, the total probability of failure is $0.6^{\log_\lambda(2\sqrt{N}) - \log_\lambda(0.69\sqrt{N/M})}$, which gives us $0.6^{\log_\lambda(2.9\sqrt{M})} \approx (2.8M)^{-0.25/\ln\lambda}$. The running time is the sum:

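$$
\begin{aligned}
t &< \sum_{k=0}^{\log_\lambda(0.69\sqrt{N/M})} \frac{\lambda^k}{2} + \sum_{k=\log_\lambda(0.69\sqrt{N/M})}^{\log_\lambda(2\sqrt{N})} \frac{\lambda^k}{2}\,(0.6)^{k - \log_\lambda(0.69\sqrt{N/M})} \\
&\approx \int_{0}^{\log_\lambda(0.69\sqrt{N/M})} \frac{\lambda^k}{2}\,dk + \int_{\log_\lambda(0.69\sqrt{N/M})}^{\log_\lambda(2\sqrt{N})} \frac{\lambda^k}{2}\,(0.6)^{k - \log_\lambda(0.69\sqrt{N/M})}\,dk \\
&< \frac{0.69\sqrt{N/M}}{2\ln\lambda} + (0.69\sqrt{N/M})^{-\log_\lambda 0.6} \int_{0.69\sqrt{N/M}}^{2\sqrt{N}} \frac{x^{\log_\lambda 0.6}}{2}\,dx \\
&= \frac{0.69\sqrt{N/M}}{2\ln\lambda} + (0.69\sqrt{N/M})^{-\log_\lambda 0.6} \left[ \frac{1}{2}\,\frac{x^{1 + \log_\lambda 0.6}}{1 + \log_\lambda 0.6} \right]_{0.69\sqrt{N/M}}^{2\sqrt{N}} \\
&\approx \frac{0.69\sqrt{N/M}}{2\ln\lambda} + \frac{\sqrt{N}\,(3\sqrt{M})^{\log_\lambda 0.6}}{1 + \log_\lambda 0.6} - \frac{1}{2}\,\frac{\sqrt{N/M}}{1 + \log_\lambda 0.6}
\end{aligned}
$$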



Since we have $\sqrt{N/M}$ dependence from the first term, we should choose $\lambda$ such that the second term contributes no worse, which gives us the condition $\log_\lambda 0.6 < -1$, or $\lambda < 1.64$. We now have:

$$t < \frac{0.69\sqrt{N/M}}{2\ln\lambda} - \frac{1}{2}\,\frac{\sqrt{N/M}}{1 + \log_\lambda 0.6}$$



which is minimal for $\lambda \approx 1.31$, and more importantly is $O(\sqrt{N/M})$. We would like to note that this is about 50% faster than Boyer, Brassard, Høyer and Tapp's arbitrary choice of $\lambda = \frac{6}{5}$, but that is only true in this approximation; not only that, but the optimal value for $\lambda$ depends on the value of $M/N$, so there is no one optimal $\lambda$ in general.
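The location of the minimum can be reproduced with a simple grid search. The sketch below assumes the bound reads $t < \left(\frac{0.69}{2\ln\lambda} - \frac{1}{2(1 + \log_\lambda 0.6)}\right)\sqrt{N/M}$, with the second term positive because $1 + \log_\lambda 0.6 < 0$ in the allowed range; the function name `bound_coefficient` and the grid are illustrative choices rather than anything from the original analysis.

```python
import math

def bound_coefficient(lam):
    """Coefficient of sqrt(N/M) in the running-time bound, as a function of lambda.

    Assumes 1 < lam < 1/0.6, so that 1 + log_lam(0.6) is negative.
    """
    c = math.log(0.6) / math.log(lam)   # log base lam of 0.6 (negative)
    return 0.69 / (2 * math.log(lam)) - 0.5 / (1 + c)

# Scan lambda over [1.01, 1.61) in steps of 1e-4.
grid = [1.01 + 1e-4 * i for i in range(6000)]
best_lam = min(grid, key=bound_coefficient)
best_coeff = bound_coefficient(best_lam)
print(round(best_lam, 2), round(best_coeff, 2))
```

Under these assumptions the minimum falls close to $\lambda = 1.31$, and the coefficient there stays below the 1.9 quoted for the $M < N/2$ case.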

Using $\lambda = 1.31$, our results can be summarized in Table 5. Most important to us is that our running time is $O(\sqrt{N/M})$ calls to $F$, and our probability of failure is less than $.5M^{-.93}$. It is also worth noting that our earlier restriction, $\lambda < 1.64$, came because we chose a small root for $\tan(x) = x$. If we had chosen a larger root, $\lambda$ could have been larger, up to an asymptotic maximum of 2.



Case         Probability of Failure    Average Running Time
$M < N/2$    $< .4M^{-.93}$            $< 1.9\sqrt{N/M}$
$M > N/2$    $< .5N^{-.96}$            $< 2.3$

Table 5: Probability of failure and average running time for BBHT, taking $\lambda$ to be 1.31
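The $M < N/2$ failure entry can be recovered from the expression $(2.8M)^{-0.25/\ln\lambda}$ given earlier by substituting $\lambda = 1.31$. The short sketch below does that arithmetic; the variable names are illustrative, and only this one table entry is checked.

```python
import math

lam = 1.31

# (2.8 * M) ** (-0.25 / ln(lam)) == prefactor * M ** (-exponent)
exponent = 0.25 / math.log(lam)
prefactor = 2.8 ** (-exponent)

print(round(prefactor, 1), round(exponent, 2))  # 0.4 0.93
```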